(54) Method of and apparatus for detecting a human face and observer tracking display



(57) A method is provided for detecting a human face in an image, such as a sequence of images
supplied by a video camera (3). The method comprises locating (17) in each image a candidate face
region and analysing (18) the candidate face region for a first characteristic indicative of a
facial feature. The locating step (17) may comprise detecting (S23) uniformly saturated regions of
predetermined shape in a reduced resolution version of the image. The analysing step (18) may
comprise selecting a single colour component (S10), forming a vertical integral projection profile
and detecting (S31) an omega shape in the profile characteristic of an eye region of a face.
Description

[0001] The present invention relates to a method of and an apparatus for detecting a human face.
Such a method may, for example, be used for capturing a target image in an initialisation stage of
an image tracking system. The present invention also relates to an observer tracking display, for
instance of the autostereoscopic type, using an image tracking system including such an apparatus.

[0002] Other applications of such methods and apparatuses include security surveillance, video and
image compression, video conferencing, multimedia database searching, computer games, driver
monitoring, graphical user interfaces, face recognition and personal identification.

[0003] Autostereoscopic displays enable a viewer to see two separate images forming a stereoscopic
pair by viewing such displays with the eyes in two viewing windows. Examples of such displays are
disclosed in EP 0 602 934, EP 0 656 555, EP 0 708 351, EP 0 726 482 and EP 0 829 743. An example of
a known type of observer tracking autostereoscopic display is illustrated in Figure 1 of the
accompanying drawings.

[0004] The display comprises a display system 1 co-operating with a tracking system 2. The tracking
system 2 comprises a tracking sensor 3 which supplies a sensor signal to a tracking processor 4. The
tracking processor 4 derives from the sensor signal an observer position data signal which is
supplied to a display control processor 5 of the display system 1. The processor 5 converts the
position data signal into a window steering signal and supplies this to a steering mechanism 6 of a
tracked 3D display 7. The viewing windows for the eyes of the observer are thus steered so as to
follow movement of the head of the observer and, within the working range, to maintain the eyes of
the observer in the appropriate viewing windows.

[0005] GB 2 324 428 and EP 0 877 274 disclose an observer video tracking system which has a short
latency time, a high update frequency and adequate measurement accuracy for observer tracking
autostereoscopic displays. Figure 2 of the accompanying drawings illustrates an example of the
system, which differs from that shown in Figure 1 of the accompanying drawings in that the tracking
sensor 3 comprises a Sony XC999 NTSC video camera operating at a 60 Hz field rate and the tracking
processor 4 is provided with a mouse 8 and comprises a Silicon Graphics entry level machine of the
Indy series equipped with an R4400 processor operating at 150 MHz and a video digitiser and frame
store having a resolution of 640 x 240 picture elements (pixels) for each field captured by the
camera 3. The camera 3 is disposed on top of the display 7 and points towards the observer, who sits
in front of the display. The normal distance between the observer and the camera 3 is about 0.85
metres, at which distance the observer has a freedom of movement in the lateral or X direction of
about 450 mm. The distance between two pixels in the image formed by the camera corresponds to about
0.67 and 1.21 mm in the X and Y directions, respectively. The Y resolution is halved because each
interlaced field is used individually.

[0006] Figure 3 of the accompanying drawings illustrates in general terms the tracking method
performed by the processor 4. The method comprises an initialisation stage 9 followed by a tracking
stage 10. During the initialisation stage 9, a target image or "template" is captured by storing a
portion of an image from the camera 3. The target image generally contains the observer eye region
as illustrated at 11 in Figure 4 of the accompanying drawings. Once the target image or template 11
has been successfully captured, observer tracking is performed in the tracking stage 10.

[0007] A global target or template search is performed at 12 so as to detect the position of the
target image within the whole image produced by the camera 3. Once the target image has been
located, motion detection is performed at 13, after which a local target or template search is
performed at 14. The template matching steps 12 and 14 are performed by cross-correlating the target
image in the template with each sub-section overlaid by the template. The best correlation value is
compared with a predetermined threshold to check whether tracking has been lost in step 15. If so,
control returns to the global template matching step 12. Otherwise, control returns to the step 13.
The motion detection 13 and the local template matching 14 form a tracking loop which is performed
for as long as tracking is maintained. The motion detection step supplies position data by a
differential method which determines the movement of the target image between consecutive fields and
adds this to the position found by local template matching in the preceding step for the earlier
field.
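The template matching of steps 12 and 14 can be sketched as follows. This is only an illustrative
sketch: the source says no more than that the template is cross-correlated with each sub-section and
that the best value is compared with a threshold. The zero-mean normalisation, the function names
`match_template` and `tracking_lost`, and the threshold of 0.5 are assumptions made for the example.

```python
from math import sqrt

def match_template(image, template):
    """Slide the template over the image; return (best_score, (row, col)).

    Assumption: scores are zero-mean normalised cross-correlations in [-1, 1].
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    t_vals = [v for row in template for v in row]
    t_mean = sum(t_vals) / len(t_vals)
    t_dev = [v - t_mean for v in t_vals]
    t_norm = sqrt(sum(d * d for d in t_dev))
    best = (-2.0, (0, 0))
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            w_vals = [image[r + i][c + j] for i in range(th) for j in range(tw)]
            w_mean = sum(w_vals) / len(w_vals)
            w_dev = [v - w_mean for v in w_vals]
            w_norm = sqrt(sum(d * d for d in w_dev))
            if t_norm == 0 or w_norm == 0:
                continue  # flat patch: correlation is undefined, skip it
            score = sum(a * b for a, b in zip(t_dev, w_dev)) / (t_norm * w_norm)
            if score > best[0]:
                best = (score, (r, c))
    return best

def tracking_lost(best_score, threshold=0.5):
    """Step 15: tracking is treated as lost when the best score is below the threshold."""
    return best_score < threshold
```

A global search passes the whole field as `image`; a local search would pass only a small window
around the previously found position.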

[0008] The initialisation stage 9 obtains a target image or a template of the observer before
tracking starts. The initialisation stage disclosed in GB 2 324 428 and EP 0 877 274 uses an
interactive method in which the display 7 displays the incoming video images and an image generator,
for example embodied in the processor 4, generates a border image or graphical guide 16 on the
display as illustrated in Figure 5 of the accompanying drawings. A user-operable control, for
instance forming part of the mouse 8, allows manual actuation of capturing of the image region
within the border image.

[0009] The observer views his own image on the display 7 together with the border image, which is of
the required template size. The observer aligns the midpoint between his eyes with the middle line
of the graphical guide 16 and then activates the system to capture the template, for instance by
pressing a mouse button or a keyboard key. Alternatively, this alignment may be achieved by dragging
the graphical guide 16 to the desired place using the mouse 8.

[0010] An advantage of such an interactive template capturing technique is that the observer is able
to select the template with acceptable alignment accuracy. This involves the recognition of the
human face and the selection of the interesting image regions, such as the eye regions. Whereas
human vision renders this process trivial, such template capture would be difficult for a computer,
given all possible types of people with different age, sex, eye shape and skin colour under various
lighting conditions.

[0011] However, such an interactive template capturing method is not convenient for regular users
because template capture has to be performed for each use of the system. For non-regular users, such
as a visitor, there is another problem in that they have to learn how to cooperate with the system.
For example, new users may need to know how to align their faces with the graphical guide. This
alignment is seemingly intuitive but has been found awkward for many new users. It is therefore
desirable to provide an improved arrangement which increases the ease of use and market
acceptability of tracking systems.

[0012] In order to avoid repeated template capture for each user, it is possible to store each
captured template of the users in a database. When a user uses the system for the first time, the
interactive method may be used to capture the template, which is then stored in the database.
Subsequent uses by the same user may not require a new template as the database can be searched to
find his or her template. Each user may need to provide more than one template to accommodate, for
example, changes of lighting and changes of facial features. Thus, although this technique has the
advantage of avoiding the need to capture a template for each use of the display, it is only
practical if the number of users is very small. Otherwise, the need to build a large database and
the associated long searching time would become prohibitive for any commercial implementation. For
example, point-of-sale systems with many one-time users would not easily be able to store a database
with every user.

[0013] It is possible to capture templates automatically using image processing and computer vision
techniques. This is essentially a face and/or eye detection problem, which forms part of a more
general problem of face recognition. A complete face recognition system should be able to detect
faces automatically and identify a person from each face. The task of automatic face detection is
different from that of identification, although many methods which are used for identification may
also be used for detection and vice versa.

[0014] Much of the computer vision research in the field of face recognition has focused so far on
the identification task and examples of this are disclosed in R Brunelli and T Poggio, "Face
recognition through geometrical feature," Proceedings of the 2nd European Conference on Computer
Vision, pp. 792-800, Genoa, 1992; US 5 164 992A; M Turk and A Pentland, "Eigenfaces for
recognition," Journal of Cognitive Neuroscience, Vol 3, No 1, pp. 70-86; and A L Yuille, D S Cohen
and P W Hallinan, "Feature extraction from faces using deformable templates," International Journal
of Computer Vision, 8(2), pp. 99-111, 1992. Many of these examples have shown a clear need for
automatic face detection but the problem and solution tend to be ignored or have not been well
described. These known techniques either assume that a face is already detected and that its
position is known in an image or limit the applications to situations where the face and the
background can be easily separated. Few known techniques for face detection achieve a reliable
detection rate without restrictive constraints and long computing time.

[0015] DE 19634768 discloses a method of detecting a face in a video picture. The method compares an
input image with a pre-stored background image to produce a binary mask which can be used to locate
the head region, which is further analysed with regard to the possibility of the presence of a face.
This method requires a controlled background which does not change. However, it is not unusual for
people to move around in the background while one user is watching an autostereoscopic display.

[0016] G Yang and T S Huang, "Human face detection in complex backgrounds", Pattern Recognition,
Vol. 27, No. 1, pp. 53-63, 1994 disclose a method of locating human faces in an uncontrolled
background using a hierarchical knowledge-based technique. The method comprises three levels. The
higher two levels are based on mosaic images at different resolutions. In the lowest level, an edge
detection method is proposed. The system can locate unknown human faces spanning a fairly wide range
of sizes in a black-and-white picture. Experimental results have been reported using a set of 40
pictures as the training set and a set of 60 pictures as the test set. Each picture has 512 x 512
pixels and allows for face sizes ranging from 48 x 60 to 200 x 250 pixels. The system has achieved a
detection rate of 83%, i.e. 50 out of 60. In addition to correctly located faces, false faces were
detected in 28 pictures of the test set. While this detection rate is relatively low, a bigger
problem is the computing time of 60 to 120 seconds for processing each image.

[0017] US 5 012 522 discloses a system which is capable of locating human faces in video scenes with
random content within two minutes and of recognising the faces which it locates. When an optional
motion detection feature is included, the location and recognition events occur in less than 1
minute. The system is based on an earlier autonomous face recognition machine (AFRM) disclosed in
E J Smith, "Development of autonomous face recognition machine", Master thesis, Doc# AD-A178852, Air
Force Institute of Technology, December 1986, with improved speed and detection score. The AFRM was
developed from an earlier face recognition machine by including an automatic "face finder", which
was developed using Cortical Thought Theory (CTT). CTT involves the use of an algorithm which
calculates the "gestalt" of a given pattern. According to the theory, the gestalt represents the
essence or "single characterisation" uniquely assigned by the human brain to an entity such as a
two-dimensional image. The face finder



EP 0 984 386 A2



works by searching an image for certain facial characteristics or "signatures". The facial
signatures are present in most facial images and are rarely present when no face is present.

[0018] The most important facial signature in the AFRM is the eye signature, which is generated by
extracting columns from an image and by plotting the results of gestalt calculated for each column.
First an 8 pixel (vertical) by 192 pixel (horizontal) window is extracted from a 128 by 192 pixel
image area. The 8 by 192 pixel window is then placed at the top of a new 64 by 192 pixel image. The
remaining rows of the 64 by 192 pixel image are filled in with a background grey level intensity,
for instance 12 out of the total of 18 grey levels where 0 represents black. The resulting image is
then transformed into the eye signature by calculating the gestalt point for each of the 192
vertical columns in the image. This results in a 192-element vector of gestalt points. If an eye
region exists, this vector shows a pattern that is characterised by two central peaks corresponding
to the eye centres and a central minimum between the two peaks together with two outer minima on
either side. If such a signature is found, an eye region may exist. A similar technique is then
applied to produce a nose/mouth signature to verify the existence of the face. The AFRM achieved a
94% success rate for the face finder algorithm using a small image database containing 139 images
(about 4 to 5 different pictures per subject). A disadvantage of such a system is that there are too
many objects in an image which can display a similar pattern. It is not, therefore, a very reliable
face locator. Further, the calculation of the gestalts is very computing intensive so that it is
difficult to achieve real time implementation.

[0019] EP 0 751 473 discloses a technique for locating candidate face regions by filtering,
convolution and thresholding. A subsequent analysis checks whether candidate face features,
particularly the eyes and the mouth, have certain characteristics.

[0020] US 5 715 325 discloses a technique involving reduced resolution images. A location step
compares an image with a background image to define candidate face regions. Subsequent analysis is
based on a three level brightness image and is performed by comparing each candidate region with a
stored template.

[0021] US 5 629 752 discloses a technique in which analysis is based on locating body contours in an
image and checking for symmetry and other characteristics of such contours. This technique also
checks for horizontally symmetrical eye regions by detecting horizontally symmetrical dark ellipses
whose major axes are oriented symmetrically.

[0022] Sato et al, Proceedings of 12th IAPR International Conference on Pattern Recognition,
Jerusalem, 6-13 October 1994, Vol. II, pp. 320-324, "Real Time Facial Feature Tracking Based on
Matching Techniques and its Applications" discloses various analysis techniques including detection
of eye regions by comparison with a stored template.

[0023] Chen et al, IEEE (0-8186-7042-8), pp. 591-596, 1995, "Face Detection by Fuzzy Pattern
Matching" performs candidate face location by fuzzy matching to a face "model". Candidates are
analysed by checking whether eye/eyebrow and nose/mouth regions are present on the basis of an
undefined "model".

[0024] According to a first aspect of the invention, there is provided a method of detecting a human
face in an image, comprising locating in the image a candidate face region and analysing the
candidate face region for a first characteristic indicative of a facial feature, characterised in
that the first characteristic comprises a substantially symmetrical horizontal brightness profile
comprising a maximum disposed between first and second minima and in that the analysing step
comprises forming a vertical integral projection of a portion of the candidate face region and
determining whether the vertical integral projection has first and second minima disposed
substantially symmetrically about a maximum.

[0025] The locating and analysing steps may be repeated for each image of a sequence of images, such
as consecutive fields or frames from a video camera.

[0026] The or each image may be a colour image and the analysing step may be performed on a colour
component of the colour image.

[0027] The analysing step may determine whether the vertical integral projection has first and
second minima whose horizontal separation is within a predetermined range.

[0028] The analysing step may determine whether the vertical integral projection has a maximum and
first and second minima such that the ratio of the difference between the maximum and the smaller of
the first and second minima to the maximum is greater than a first threshold.

[0029] The vertical integral projection may be formed for a plurality of portions of the face
candidate and the portion having the highest ratio may be selected as a potential target image.
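The projection test described above might be sketched as follows. The way the two minima and the
maximum are searched for (one minimum in each half of the profile, the maximum between them) is an
assumption made for this example, as are the function names and all numeric parameters; the source
specifies only the separation range and the ratio test.

```python
def vertical_integral_projection(region):
    """V(x): sum of the brightness values in column x of the region (a list of rows)."""
    return [sum(column) for column in zip(*region)]

def find_omega(profile, min_sep, max_sep, ratio_threshold):
    """Return (left_min_x, max_x, right_min_x, ratio), or None if the tests fail.

    Assumed search strategy: one minimum per half, maximum strictly between them.
    """
    if len(profile) < 3:
        return None
    mid = len(profile) // 2
    left = min(range(mid), key=profile.__getitem__)
    right = min(range(mid, len(profile)), key=profile.__getitem__)
    if right - left < 2:
        return None  # no room for a maximum between the minima
    peak = max(range(left + 1, right), key=profile.__getitem__)
    v_max = profile[peak]
    if v_max <= 0:
        return None
    # Ratio of (maximum - smaller minimum) to the maximum.
    ratio = (v_max - min(profile[left], profile[right])) / v_max
    if not (min_sep <= right - left <= max_sep) or ratio <= ratio_threshold:
        return None
    return left, peak, right, ratio
```

When several portions of a face candidate are tested, the portion whose returned ratio is highest
would be kept as the potential target image.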

[0030] The analysing step may comprise forming a measure of the symmetry of the portion. 
[0031] The symmetry measure may be formed as:

[equation lost in extraction; only the lower summation limit x = 0 survives]

[0032] where V(x) is the value of the vertical integral projection at horizontal position x and x0
is the horizontal position of the middle of the vertical integral projection.
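Because the formula itself is not legible here, the following is only a hypothetical stand-in that
uses the same ingredients: the projection V(x), its middle position x0 and a sum starting at x = 0.
The normalised form below, which gives 1.0 for a perfectly mirror-symmetric profile and smaller
values otherwise, is an assumption and should not be read as the measure defined in the
specification.

```python
def symmetry_measure(profile):
    """Hypothetical symmetry score of a projection profile about its middle x0.

    Assumed form: 1 - sum|V(x0 - x) - V(x0 + x)| / sum(V(x0 - x) + V(x0 + x)),
    with x running from 0 to the largest offset that stays inside the profile.
    """
    n = len(profile)
    x0 = n // 2
    reach = min(x0, n - 1 - x0)
    diff = sum(abs(profile[x0 - x] - profile[x0 + x]) for x in range(reach + 1))
    total = sum(profile[x0 - x] + profile[x0 + x] for x in range(reach + 1))
    return 1.0 if total == 0 else 1.0 - diff / total
```

With a score of this kind, "highest symmetry measure" corresponds to the most symmetric portion.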
[0033] The vertical integral projection may be formed for a plurality of portions of the face
candidate and the portion having the highest symmetry measure may be selected as a potential target
image.

[0034] The analysing step may comprise dividing a portion of the candidate face region into left and
right halves, forming a horizontal integral projection of each of the halves, and comparing a
measure of horizontal symmetry of the left and right horizontal integral projections with a second
threshold.
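A minimal sketch of this left/right comparison is given below. The source does not define the
"measure of horizontal symmetry", so the normalised difference used here and the function names are
assumptions; a candidate would be kept when the returned value exceeds the second threshold under
this particular convention.

```python
def horizontal_integral_projection(region):
    """H(y): sum of the brightness values in row y of the region (a list of rows)."""
    return [sum(row) for row in region]

def halves_symmetry(region):
    """Compare row sums of the left and right halves; 1.0 means identical profiles.

    Assumed measure: 1 - sum|HL(y) - HR(y)| / (sum HL + sum HR). A middle column
    is ignored when the width is odd.
    """
    width = len(region[0])
    half = width // 2
    left = [row[:half] for row in region]
    right = [row[width - half:] for row in region]
    hl = horizontal_integral_projection(left)
    hr = horizontal_integral_projection(right)
    diff = sum(abs(a - b) for a, b in zip(hl, hr))
    total = sum(hl) + sum(hr)
    return 1.0 if total == 0 else 1.0 - diff / total
```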

[0035] The analysing step may determine whether the candidate face region has first and second brightness minima 

disposed at substantially the same height with a horizontal separation within a predetermined range. 

[0036] The analysing step may determine whether the candidate face region has a vertically extending region of 

higher brightness than and disposed between the first and second brightness minima. 
[0037] The analysing step may determine whether the candidate face region has a horizontally extending region

disposed below and of lower brightness than the vertically extending region. 

[0038] The analysing step may comprise locating, in the candidate face region, candidate eye pupil
regions where a green image component is greater than a red image component or where a blue image
component is greater than a green image component. Locating the candidate eye pupil regions may be
restricted to candidate eye regions of the candidate face region. The analysing step may form a
function E(x,y) for picture elements (x,y) in the candidate eye regions such that



where R, G and B are red, green and blue image components, C1 and C2 are constants, E(x,y) = 1
represents a picture element inside the candidate eye pupil regions and E(x,y) = 0 represents a
picture element outside the candidate eye pupil regions. The analysing step may detect the centres
of the eye pupils as the centroids of the candidate eye pupil regions.
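The pupil test and centroid step might look like the sketch below. The exact form of E(x,y) is not
legible in this text, so the comparisons `g > r + c1` and `b > g + c2` are assumptions that merely
follow the verbal description (green greater than red, or blue greater than green); the constant
values and function names are likewise hypothetical.

```python
def eye_pupil_mask(rgb, c1=0, c2=0):
    """E(x, y) for an image given as rows of (R, G, B) tuples.

    Assumed rule: 1 where G > R + c1 or B > G + c2, otherwise 0.
    """
    return [[1 if (g > r + c1 or b > g + c2) else 0 for (r, g, b) in row] for row in rgb]

def centroid(mask):
    """Centroid (x, y) of the picture elements where the mask is 1, or None if empty."""
    points = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not points:
        return None
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)
```

In use, each candidate eye region would be masked separately so that each centroid corresponds to
one pupil.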

[0039] The analysing step may comprise locating a candidate mouth region in a sub-region of the
candidate face region which is horizontally between the candidate eye pupil regions and vertically
below the level of the candidate eye pupil regions by between substantially half and substantially
one and a half times the distance between the candidate eye pupil regions. The analysing step may
form a function M(x,y) for picture elements (x,y) within the sub-region such that:



where R, G and B are red, green and blue image components, η is a constant, M(x,y) = 1 represents a
picture element inside the candidate mouth region and M(x,y) = 0 represents a picture element
outside the candidate mouth region. Vertical and horizontal projection profiles of the function
M(x,y) may be formed and a candidate lip region may be defined in a rectangular sub-region where the
vertical and horizontal projection profiles exceed first and second predetermined thresholds,
respectively. The first and second predetermined thresholds may be proportional to maxima of the
vertical and horizontal projection profiles, respectively.
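Given a binary mask M(x,y), however it is computed, the lip rectangle can be found from the two
projection profiles as sketched below. The single proportionality factor `fraction` (applied to both
maxima) and the function name are assumptions made for the example.

```python
def lip_rectangle(mask, fraction=0.5):
    """Return (x_min, y_min, x_max, y_max) of the candidate lip region, or None.

    The vertical profile holds column sums and the horizontal profile holds row
    sums of the mask. Each threshold is `fraction` times that profile's maximum.
    """
    columns = [sum(c) for c in zip(*mask)]
    rows = [sum(r) for r in mask]
    if not columns or max(columns) == 0:
        return None  # empty mask: no mouth pixels were marked
    xs = [x for x, v in enumerate(columns) if v > fraction * max(columns)]
    ys = [y for y, v in enumerate(rows) if v > fraction * max(rows)]
    return (min(xs), min(ys), max(xs), max(ys))
```

The stray mask pixel in the last row of the test data below is rejected because its row sum falls
under the threshold, which is the effect the proportional thresholds are meant to have.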

[0040] The analysing step may check whether the aspect ratio of the candidate lip region is between
first and second predefined thresholds.

[0041] The analysing step may check whether the ratio of the vertical distance from the candidate
eye pupil regions to the top of the candidate lip region to the spacing between the candidate eye
pupil regions is between first and second preset thresholds.

[0042] The analysing step may comprise dividing a portion of the candidate face region into left and
right halves and comparing the angles of the brightness gradients of horizontally symmetrically
disposed pairs of points for symmetry.

[0043] The locating and analysing steps may be stopped when the first characteristic is found r
times in R consecutive images of the sequence.
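The stopping rule can be illustrated with a sliding window over the most recent R images. The class
name `DetectionVote` and its interface are hypothetical; only the "r times in R consecutive images"
rule comes from the text.

```python
from collections import deque

class DetectionVote:
    """Signal completion once the characteristic was found r times in the last R images."""

    def __init__(self, r, R):
        self.r = r
        self.history = deque(maxlen=R)  # keeps only the R most recent results

    def update(self, found):
        """Record one image's result; return True when detection may stop."""
        self.history.append(bool(found))
        return sum(self.history) >= self.r
```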

[0044] The locating step may comprise searching the image for a candidate face region having a
second characteristic indicative of a human face.

[0045] The second characteristic may be uniform saturation.

[0046] The searching step may comprise reducing the resolution of the image by averaging the
saturation to form a reduced resolution image and searching for a region of the reduced resolution
image having, in a predetermined shape, a substantially uniform saturation which is substantially
different from the saturation of the portion of the reduced resolution image surrounding the
predetermined shape.

[0047] The image may comprise a plurality of picture elements and the resolution may be reduced so
that the predetermined shape is from two to three reduced resolution picture elements across.

[0048] The image may comprise a rectangular array of M by N picture elements, the reduced resolution
image may comprise (M/m) by (N/n) picture elements, each of which corresponds to m by n picture
elements of the image, and the saturation of each picture element of the reduced resolution image
may be given by:

$$P = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} f(i,j)$$

where f(i,j) is the saturation of the picture element of the ith column and the jth row of the m by
n picture elements.
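The averaging step can be sketched as below, assuming the saturation image is held as a list of rows
and that any rows or columns left over when the size is not an exact multiple of the block size are
simply dropped (the text does not say how remainders are handled). The function name is illustrative.

```python
def reduce_resolution(saturation, m, n):
    """Average each block of m columns by n rows into one reduced picture element."""
    rows, cols = len(saturation), len(saturation[0])
    reduced = []
    for by in range(rows // n):
        reduced_row = []
        for bx in range(cols // m):
            total = sum(
                saturation[by * n + j][bx * m + i]
                for j in range(n)
                for i in range(m)
            )
            reduced_row.append(total / (m * n))
        reduced.append(reduced_row)
    return reduced
```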

[0049] The method may comprise storing the saturations in a store.

[0050] A uniformity value may be ascribed to each of the reduced resolution picture elements by
comparing the saturation of each of the reduced resolution picture elements with the saturation of
at least one adjacent reduced resolution picture element.

[0051] Each uniformity value may be ascribed a first value if

(max(P) - min(P)) / max(P) ≤ T

where max(P) and min(P) are the maximum and minimum values, respectively, of the saturations of the
reduced resolution picture element and the or each adjacent picture element and T is a threshold,
and a second value different from the first value otherwise.

[0052] T may be substantially equal to 0.15.
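A sketch of the uniformity test follows. Which adjacent elements are compared is not stated at this
point in the text, so the choice of the right, lower and lower-right neighbours is an assumption, as
is the use of 1 and 0 for the first and second values and the treatment of an all-zero group as
uniform.

```python
def uniformity_map(p, t=0.15):
    """Ascribe 1 (uniform) or 0 to each reduced picture element of p (a list of rows).

    Assumed neighbourhood: the element plus its right, lower and lower-right
    neighbours, where they exist.
    """
    rows, cols = len(p), len(p[0])
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            group = [
                p[yy][xx]
                for yy in (y, y + 1)
                for xx in (x, x + 1)
                if yy < rows and xx < cols
            ]
            hi, lo = max(group), min(group)
            # (max(P) - min(P)) / max(P) <= T; a group of zeros is taken as uniform.
            out[y][x] = 1 if hi == 0 or (hi - lo) / hi <= t else 0
    return out
```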

[0053] The or each adjacent reduced resolution picture element may not have been ascribed a
uniformity value and each uniformity value may be stored in the store in place of the corresponding
saturation.

[0054] The resolution may be reduced such that the predetermined shape is two or three reduced
resolution picture elements across and the method may further comprise indicating detection of a
candidate face region when a uniformity value of the first value is ascribed to any of one reduced
resolution picture element, two vertically or horizontally adjacent reduced resolution picture
elements and a rectangular two-by-two array of picture elements and when a uniformity value of the
second value is ascribed to each surrounding reduced resolution picture element.

[0055] Detection may be indicated by storing a third value different from the first and second
values in the store in place of the corresponding uniformity value.

[0056] The method may comprise repeating the resolution reduction and searching at least once with
the reduced resolution picture elements shifted with respect to the image picture elements.

[0057] The saturation may be derived from red, green and blue components as

(max(R,G,B) - min(R,G,B)) / max(R,G,B)

where max(R,G,B) and min(R,G,B) are the maximum and minimum values, respectively, of the red, green
and blue components.
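The saturation formula translates directly into code. Returning 0.0 for a black picture element,
where the maximum component is zero and the ratio is undefined, is an assumption added here to avoid
division by zero.

```python
def saturation(r, g, b):
    """Saturation (max - min) / max of one picture element; 0.0 for pure black."""
    hi = max(r, g, b)
    return 0.0 if hi == 0 else (hi - min(r, g, b)) / hi
```

For example, a picture element with components (200, 100, 50) has a saturation of 150/200 = 0.75,
while any grey picture element has a saturation of 0.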

[0058] A first image may be captured while illuminating an expected range of positions of a face, a
second image may be captured using ambient light, and the second image may be subtracted from the
first image to form the image.
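The subtraction can be sketched per picture element as below, for single-component images held as
lists of rows. Clamping negative differences to zero is an assumption; the text says only that the
second image is subtracted from the first.

```python
def subtract_ambient(illuminated, ambient):
    """Subtract the ambient-light image from the illuminated one, clamping at zero."""
    return [
        [max(a - b, 0) for a, b in zip(row_lit, row_amb)]
        for row_lit, row_amb in zip(illuminated, ambient)
    ]
```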

[0059] According to a second aspect of the invention, there is provided an apparatus for detecting a
human face in an image, comprising means for locating in the image a candidate face region and means
for analysing the candidate face region for a first characteristic indicative of a facial feature.

[0060] According to a third aspect of the invention, there is provided an observer tracking display
including an apparatus according to the second aspect of the invention.

[0061] It is thus possible to provide a method of and an apparatus for automatically detecting a
human face in, for example, an incoming video image stream or sequence. This may be used, for
example, to replace the interactive method of capturing a template as described hereinbefore and as
disclosed in GB 2 324 428 and EP 0 877 274, for instance in an initialisation stage of an observer
video tracking system associated with a tracked autostereoscopic display. The use of such techniques
for automatic target image capture increases the ease of use of a video tracking system and an
associated autostereoscopic display and consequently increases the commercial prospects for such
systems.

[0062] By using a two-stage approach in the form of a face locator and a face analyser, the face
locator enables the more computing intensive face analysis to be limited to a number of face
candidates. Such an arrangement is capable of detecting a face in a sequence of video images in real
time, for instance at a speed of between 5 and 90 Hz, depending on the complexity of the image
content. When used in an observer tracking autostereoscopic display, the face detection may be
terminated automatically after a face is detected consistently over a number of consecutive images.
The whole process may take no more than a couple of seconds and the initialisation need only be
performed once at the beginning of each use of the system.

[0063] The face locator increases the reliability of the face analysis because the analysis need
only be performed on the or each candidate face region located in the or each image. Although a
non-face candidate region may contain image data similar to that which might be indicative of facial
features, the face locator limits the analysis based on such characteristics to the potential face
candidates. Further, the analysis helps to remove false face candidates found by the locator and is
capable of giving more precise position data of a face and facial features thereof, such as the mid
point between the eyes of an observer so that a target image of the eye region may be obtained.

[0064] By separating the functions of location and analysis, each function or step may use simpler
and more efficient methods which can be implemented commercially without excessively demanding
computing power and cost. For instance, locating potential face candidates using skin colour can
accommodate reasonable lighting changes. This technique is capable of accommodating a relatively
wide range of lighting conditions and is able to cope with people of different age, sex and skin
colour. It may even be capable of coping with the wearing of glasses of light colours.

[0065] These techniques may use any of a number of modules in terms of computer implementation. Each
of these modules may be replaced or modified to suit various requirements. This increases the
flexibility of the system, which may therefore have a relatively wide range of applications, such as
security surveillance, video and image compression, video conferencing, computer games, driver
monitoring, graphical user interfaces, face recognition and personal identification.

[0066] The invention will be further described, by way of example, with reference to the
accompanying drawings, in which:

Figure 1 is a block schematic diagram of a known type of obsen^er tracking autostereoscopic display; 
2S Figure 2 is a block schematic diagram of an obsen^er tracking display to which the present inventbn may be applied; 
Figure 3 is a flow diagram illustrating obsen^er tracking in the display of Figure 2; 

Figure 4 illustrates a typical target image or template which is captured by the method illustrated in Figure 3; 
Figure 5 illustrates the appearance of a display during template capture by the display of Figure 2; 

Figure 6 is a fkyw diagram illustrating a method of detecting a hunrtan face constituting an embodiment of the 

invention; 

3S 

Figure 7 is a flow diagram illustrating a face location part of the method illustrated in Figure 6; 
Figure 8 is a diagram Illustrating a hue-saturation-value (IHSV) colour scheme; 
<o Figure 9 is a diagram lilustrating Image resolutton reduction by averaging in the method illustrated in Figure 7; 
Figure 10 Is a diagram Illustrating calculatkin of uniformity values in the method illustrated in Figure 7; 
Figure 11 is a diagram illustrating patterns used in a face-candidate selectbn in the metiiod Illustrated in Figure 7; 

4S 

Figure 12 is a diagram illustrating the effect o1 diflerent positions ot a face on the metiiod illustrated on Figure 7; 

Figure 1 3 Is a diagram lilustrating a modification to the method Illustrated in Figure 7 for accommodating different 
face positbns; 

so 

Figure 14 is a flow diagram illustrating in mors detail the face analysis stags of the method illustrated in Figure 6; 

Figure 15 is a fk)w diagram illustrating in mora detail a facial feature extractkxn step of the method illustrated in 
Figure 14; 

Figure 16 illustrates an image portion of an eye region and a corresponding vertical integral projection;
Figure 17 illustrates a technique for searching for an eye signature;

Figure 18 is a flow diagram illustrating a further facial characteristic extraction technique forming part of the method
illustrated in Figure 14;

Figure 19 illustrates vertical integral projections of too coarse a step size;

Figure 20 illustrates the use of horizontal integral projection profiles for eliminating false face candidates;

Figure 21 illustrates detection of a pair of eyes represented as a pair of brightness minima;

Figure 22 illustrates a nose detection technique;

Figure 23 is a flow diagram illustrating in more detail a modified facial feature extraction step of the method illus- 
trated in Figure 14; 

Figure 24 illustrates eye pupil and mouth regions with vertical and horizontal integral projections of the mouth 
region; 

Figure 25 illustrates a technique based on analysing facial symmetry;

Figure 26 is a flow diagram illustrating a technique for terminating the method illustrated in Figure 14;

Figure 27 is a block schematic diagram of an observer tracking display to which the present invention is applied; and 

Figure 28 is a system block diagram of a video tracking system of the display of Figure 13 for performing the
method of the invention.

[0067] Like reference numerals refer to like parts throughout the drawings. 

[0068] Figure 6 illustrates in flow diagram form a method of automatically detecting and locating a human face in a
pixelated colour image from a video image sequence. The video image sequence may be supplied in real time, for
instance by a video camera of the type described hereinbefore with reference to Figure 2. The method is capable of
operating in real time as the initialisation stage 9 shown in Figure 3 and supplies a target image or template to the
tracking stage 10 shown in Figure 3.

[0069] In a step S1, the latest digital image in the red, green and blue (RGB) format is obtained. For instance, this
step may comprise storing the latest field of video data from the video camera in a field store. In a step S2, the image
is searched in order to locate regions constituting face candidates. A step S3 determines whether any face candidates
have been found. If not, the step S1 is performed and the steps S2 and S3 are repeated until at least one face candidate
is found in the latest image. The steps S2 and S3 therefore constitute a face locator 17 which will be described in more
detail hereinafter.

[0070] The or each face candidate is then supplied to a face analyser 18 which analyses the face candidates to
determine the presence of one or more characteristics indicative of facial features. A step S4 receives the portions of
the image, one-by-one, corresponding to the face candidates located by the face locator 17. The step S4 analyses
each face candidate and, if it determines that the candidate contains characteristics indicative of a facial feature,
extracts a target image or template in the form of the eye region illustrated at 11 in Figure 4 from the latest image supplied
from the step S1. A step S5 determines whether all of the face candidates have been tested and the step S4 is repeated
until all the candidates have been tested. A step S6 determines whether any templates have been obtained. If not,
control passes to the step S1 and the procedure is repeated for the next colour image. If any template has been
obtained, the or each such template is output at a step S7.

[0071] The face locator 17 may be of any suitable type and a manual technique for face location is described
hereinafter. However, a suitable automatic technique is disclosed in GB 2 333 590 and EP 0 932 114 and this is described
in detail with reference to Figures 7 to 13.

[0072] In a step S21, the video image is converted from the RGB format to the HSV (hue-saturation-value) format
so as to obtain the saturation of each pixel. In practice, it is sufficient to obtain the S component only in the step S21.
[0073] The RGB format is a hardware-oriented colour scheme resulting from the way in which camera sensors and
display phosphors work. The HSV format is closely related to the concepts of tint, shade and tone. In the HSV format,
hue represents colour as described by the wavelength of light (for instance, the distinction between red and yellow),
saturation represents the amount of colour that is present (for instance, the distinction between red and pink), and
value represents the amount of light (for instance, the distinction between dark red and light red or between dark grey
and light grey). The "space" in which these values may be plotted can be shown as a circular or hexagonal cone or
double cone, for instance as illustrated in Figure 8, in which the axis of the cone is the grey scale progression from
black to white, distance from the axis represents saturation and the direction or angle about the axis represents the hue.
[0074] The colour of human skin is created by a combination of blood (red) and melanin (yellow, brown). Skin colours
lie between these two extreme hues and are somewhat saturated but are not extremely saturated. The saturation
component of the human face is relatively uniform.

[0075] Several techniques exist for converting video image data from the RGB format to the HSV format. Any technique
which extracts the saturation component may be used. For instance, the conversion may be performed in accordance
with the following expression for the saturation component S:

$$S = \begin{cases} 0 & \text{for } \max(R,G,B) = 0 \\ \dfrac{\max(R,G,B) - \min(R,G,B)}{\max(R,G,B)} & \text{otherwise} \end{cases}$$
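By way of illustration only, the saturation expression may be sketched in Python as follows. The function names `saturation` and `saturation_image` and the representation of an image as a list of rows of (R, G, B) tuples are assumptions made for this sketch and do not appear in the description.

```python
def saturation(r, g, b):
    """Return the saturation S of one RGB pixel (any consistent scale)."""
    hi = max(r, g, b)
    if hi == 0:
        return 0.0  # black pixel: S is defined as 0
    return (hi - min(r, g, b)) / hi


def saturation_image(rgb_rows):
    """Map a 2-D list of (R, G, B) tuples to a 2-D list of S values."""
    return [[saturation(r, g, b) for (r, g, b) in row] for row in rgb_rows]
```

Only the S component is computed, in line with the observation that the hue and value components are not needed in the step S21.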

[0076] Following the conversion step S21, the spatial image resolution of the saturation component is reduced by
averaging in a step S22. As described hereinbefore with reference to Figure 2, the approximate distance of the face
of an observer from the display is known so that the approximate size of a face in each video image is known. The
resolution is reduced such that the face of an adult observer occupies about two to three pixels in each dimension as
indicated in Figure 7. A technique for achieving this will be described in more detail hereinafter.
[0077] A step S23 detects, in the reduced resolution image from the step S22, regions or "blobs" of uniform saturation
of predetermined size and shape surrounded by a region of reduced resolution pixels having a different saturation. A
technique for achieving this is also described in more detail hereinafter. A step S24 detects whether a face candidate
or face-like region has been found. If not, the steps S1 to S24 are repeated. When the step S24 confirms that at least
one candidate has been found, the position of the or each uniform blob detected in the step S23 is output at a step S25.
[0078] Figure 9 illustrates the image resolution reduction step S22 in more detail. 30 illustrates the pixel structure of
an image supplied to the step S1. The spatial resolution is illustrated as a regular rectangular array of M×N square or
rectangular pixels. The spatial resolution is reduced by averaging to give an array of (M/m)×(N/n) pixels as illustrated
at 31. The array of pixels 30 is effectively divided up into "windows" or rectangular blocks of pixels 32, each comprising
m×n pixels of the structure 30. The S values of the pixels are indicated in Figure 9 as f(i,j), for 0≤i<m and 0≤j<n. The
average saturation value P of the window is calculated as:

$$P = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} f(i,j)$$
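The window averaging may be sketched as follows. This is a minimal sketch which assumes that the saturation image is held as a list of rows, that m is the window width and n the window height, and that incomplete windows at the right and bottom edges are discarded; the function name `reduce_resolution` and the optional offsets `sx` and `sy` (which shift the averaging grid) are hypothetical.

```python
def reduce_resolution(sat, m, n, sx=0, sy=0):
    """Average non-overlapping m-wide, n-high windows of a saturation image.

    sat is a list of rows; (sx, sy) is the pixel offset of the first window.
    Windows that would run past the image edge are dropped (assumption).
    """
    rows, cols = len(sat), len(sat[0])
    reduced = []
    for top in range(sy, rows - n + 1, n):
        out_row = []
        for left in range(sx, cols - m + 1, m):
            total = sum(sat[top + j][left + i]
                        for j in range(n) for i in range(m))
            out_row.append(total / (m * n))  # P = (1/mn) * sum of f(i, j)
        reduced.append(out_row)
    return reduced
```

For example, a 4×4 image averaged with m = n = 2 yields a 2×2 array of window means.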

[0079] In the embodiment illustrated in the drawings, the reduction in spatial resolution is such that an adult observer
face occupies about two to three of the reduced resolution pixels in each dimension.

[0080] The step S23 comprises assigning a uniformity status or value U to each reduced resolution pixel and then
detecting patterns of uniformity values representing face-like regions. The uniformity value is 1 or 0 depending on the
saturations of the pixel and its neighbours. Figure 10 illustrates at 35 a pixel having an averaged saturation value P0
and the averaged saturation values P1, P2 and P3 of the three neighbouring pixels. The assignment of uniformity values
begins at the top left pixel 37 and proceeds from left to right until the penultimate pixel 38 of the top row has been
assigned its uniformity value. This process is then repeated for each row in turn from top to bottom ending at the
penultimate row. By "scanning" the pixels in this way and using neighbouring pixels to the right and below the pixel
whose uniformity value has been calculated, it is possible to replace the average saturation values P with the uniformity
values U by overwriting so that memory capacity can be used efficiently and it is not necessary to provide further
memory capacity for the uniformity values.
[0081] The uniformity value U is calculated as:

$$U = \begin{cases} 1 & \text{if } (f_{max} - f_{min})/f_{max} \le T \\ 0 & \text{otherwise} \end{cases}$$

where T is a predetermined threshold, for instance having a typical value of 0.16, fmax is the maximum of P0, P1, P2
and P3 and fmin is the minimum of P0, P1, P2 and P3.
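A minimal sketch of the uniformity calculation is given below. It assumes that the three neighbours are the pixels to the right, below, and diagonally below-right, that the last row and column (which have no such neighbours) are left at 0, and that an all-zero group counts as uniform; these choices, and the name `uniformity_map`, are assumptions of the sketch. A separate output array is used for clarity, whereas the description overwrites the averaged values in place.

```python
def uniformity_map(p, t):
    """Return uniformity values U (0 or 1) for averaged saturations p.

    p is a list of rows of averaged saturation values; t is the threshold T.
    """
    rows, cols = len(p), len(p[0])
    u = [[0] * cols for _ in range(rows)]
    for y in range(rows - 1):          # stop at the penultimate row
        for x in range(cols - 1):      # stop at the penultimate pixel
            group = (p[y][x], p[y][x + 1], p[y + 1][x], p[y + 1][x + 1])
            fmax, fmin = max(group), min(group)
            # fmax == 0 means all four values are 0: treated as uniform here.
            if fmax == 0 or (fmax - fmin) / fmax <= t:
                u[y][x] = 1
    return u
```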

[0082] When the ascribing of the uniformity values has been completed, the array 36 contains a pattern of 0s and
1s representing the uniformity of saturation of the reduced resolution pixels. The step S23 then looks for specific patterns
of 0s and 1s in order to detect face-like regions. Figure 11 illustrates an example of four patterns of uniformity values
and the corresponding pixel saturation patterns which are like the face candidates in the video images. Figure 11 shows
at 40 a uniform blob in which dark regions represent averaged saturation values of sufficient uniformity to indicate a
face-like region. The surrounding light regions or squares represent a region surrounding the uniform saturation pixels
and having substantially different saturations. The corresponding pattern of uniformity values is illustrated at 41 and
comprises a pixel location with the uniformity value 1 completely surrounded by pixel locations with the uniformity
value 0.

[0083] Similarly, Figure 11 shows at 42 another face-like region and at 43 the corresponding pattern of uniformity
values. In this case, two horizontally adjacent pixel locations have the uniformity value 1 and are completely surrounded
by pixel locations having the uniformity value 0. Figure 11 illustrates at 44 a third pattern whose uniformity values are
as shown at 45 and are such that two vertically adjacent pixel locations have the uniformity value 1 and are surrounded
by pixel locations with the uniformity value 0.

[0084] The fourth pattern shown at 46 in Figure 11 has a square block of four (two-by-two) pixel locations having the
uniformity value 1 completely surrounded by pixel locations having the uniformity value 0. Thus, whenever any of the
uniformity value patterns illustrated at 41, 43, 45 and 47 in Figure 11 occurs, the step S23 indicates that a face-like
region or candidate has been found. Searching for these patterns can be performed efficiently. For instance, the
uniformity values of the pixel locations are checked in turn, for instance left to right in each row and top to bottom of the
field. Whenever a uniformity value of 1 is detected, the neighbouring pixel locations to the right and below the current
pixel location are inspected. If at least one of these uniformity values is also 1 and the region is surrounded by uniformity
values of 0, then a pattern corresponding to a potential face candidate is found. The corresponding pixel locations may
then be marked, for instance by replacing their uniformity values with a value other than 1 or 0, for example a value of
2. Unless no potential face candidate has been found, the positions of the candidates are output.
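The search for the four patterns may be sketched as below. This is not the scanning order of the description but an equivalent brute-force check of each block shape at each location; treating locations outside the array as having the value 0, returning positions rather than marking them with the value 2, and the names `SHAPES`, `_value` and `find_face_candidates` are all assumptions of the sketch.

```python
# (height, width) of the block of 1s: single, horizontal pair,
# vertical pair and two-by-two block.
SHAPES = [(1, 1), (1, 2), (2, 1), (2, 2)]


def _value(u, y, x):
    """Uniformity at (y, x); locations outside the array count as 0."""
    if 0 <= y < len(u) and 0 <= x < len(u[0]):
        return u[y][x]
    return 0


def find_face_candidates(u):
    """Return (row, column, height, width) of each block of 1s ringed by 0s."""
    found = []
    for y in range(len(u)):
        for x in range(len(u[0])):
            for h, w in SHAPES:
                inside = {(y + dy, x + dx)
                          for dy in range(h) for dx in range(w)}
                ring = {(yy, xx)
                        for yy in range(y - 1, y + h + 1)
                        for xx in range(x - 1, x + w + 1)} - inside
                if (all(_value(u, yy, xx) == 1 for yy, xx in inside)
                        and all(_value(u, yy, xx) == 0 for yy, xx in ring)):
                    found.append((y, x, h, w))
    return found
```

Because the surrounding ring must be entirely 0, a block cannot be reported both as a small pattern and as part of a larger one, and runs of three or more 1s are rejected.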
[0085] The appearance of the patterns 40, 42, 44 and 46 may be affected by the actual position of the face-like region
in relation to the structure of the reduced resolution pixels 36. Figure 12 illustrates an example of this for a face-like
region having a size of two-by-two reduced resolution pixels as shown at 49. If the face-like region indicated by a circle
50 is approximately centred at a two-by-two block, the pattern 47 of uniformity values will be obtained and detection
will be correct. However, if the face were shifted by the extent of half a pixel in both the horizontal and vertical directions
as illustrated at 51, the centre part of the face-like region may have a uniformity value which is different from the
surrounding region. This may result in failure to detect a genuine candidate.

[0086] In order to avoid this possible problem, the steps S21 to S24 may be repeated for the same video field or for
one or more succeeding video fields of image data. However, each time the steps S21 to S24 are repeated, the position
of the array 31 of reduced resolution pixels is changed with respect to the array 30 of the colour image pixels. This is
illustrated in Figure 13 where the whole image is illustrated at 52 and the region used for spatial resolution reduction
by image averaging is indicated at 53. The averaging is performed in the same way as illustrated in Figure 9 but the
starting position is changed. In particular, whereas the starting position for the first pixel in Figure 9 is at the top left
corner 54 of the whole image 52, Figure 13 illustrates a subsequent averaging where the starting position is shifted
from the top left corner by an amount Sx to the right in the horizontal direction and Sy downwardly in the vertical
direction, where:

$$0 < S_x < m \quad \text{and} \quad 0 < S_y < n$$

[0087] Each image may be repeatedly processed such that all combinations of the values of Sx and Sy are used so
that m×n processes must be performed. However, in practice, it is not necessary to use all of the starting positions,
particularly in applications where the detection of face-like regions does not have to be very accurate. In the present
example where the face-like region detection forms the first step of a two step process, the values of Sx and Sy may
be selected from a more sparse set of combinations such as:

$$S_x = i \times (m/p) \quad \text{and} \quad S_y = j \times (n/q)$$

[0088] where i, j, p and q are integers satisfying the following relationships:

$$0 \le i < p, \qquad 0 \le j < q, \qquad 1 \le p < m, \qquad 1 \le q < n$$

[0089] This results in a total of pxq combinations. 
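The sparse set of starting positions may be generated as in the following sketch. Integer division is used so that the offsets are whole pixels, which is an assumption of the sketch, as is the function name `sparse_offsets`.

```python
def sparse_offsets(m, n, p, q):
    """Return the p*q starting offsets (Sx, Sy) for an m-by-n averaging window.

    Sx = i*(m/p) for 0 <= i < p and Sy = j*(n/q) for 0 <= j < q, rounded
    down to whole pixels (assumption).
    """
    if not (1 <= p <= m and 1 <= q <= n):
        raise ValueError("p and q must lie between 1 and the window size")
    return [(i * m // p, j * n // q) for i in range(p) for j in range(q)]
```

With an 8×8 window and p = q = 2, for example, only four of the possible starting positions are tried.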

[0090] As mentioned hereinbefore, the steps S21 to S24 may be repeated with the different starting positions on the
same image or on a sequence of images. For real time image processing, it may be necessary or preferable to repeat
the steps for the images of a sequence. The method may be performed very quickly and can operate in real time at
a rate of between 10 and 60 Hz depending on the number of face candidates present in the image. Thus, within a short
period of the order of a very few seconds or less, all possible positions can be tested.
[0091] The method illustrated in Figure 7 may be performed on any suitable hardware, such as that illustrated in
Figure 2. The tracking processor 4 as described hereinbefore is capable of being programmed to implement the method
of Figure 7 as part of the initialisation stage 9 shown in Figure 3. The data processing is performed by the R4400
processor and associated memory and the processor 4 includes a video digitiser and frame store as illustrated in Figure
2 for storing the saturation values, the averaged saturation values of the reduced resolution pixels and the uniformity
values.

[0092] Figure 14 illustrates the face analyser 18 in more detail. In particular, the analysis performed in the step S4 is
shown as steps S10 to S14 in Figure 14.

[0093] Although the analysis may be performed in the RGB domain, it is sufficient to make use of a single colour
component. Accordingly, the step S10 selects, for example, the red colour component from the latest colour image.
As an alternative, another single-value component may be used. For example, a contrast image may be derived using
the equation:

$$C = \max(R,G,B) - \min(R,G,B)$$

[0094] The use of such a contrast image may improve detection of the omega-shape as described hereinafter.
[0095] The step S11 selects one of the face candidates provided by the face locator 17 and selects the image area
of the red component specified by the face candidate. The step S12 extracts facial features to confirm the existence
of a face in the image and to obtain the precise position of the face. The step S13 determines whether a face is found
and, if not, passes control to the step S5. If a face has been found, the step S14 selects or updates the target image
in the form of an eye template, such as that shown at 11 in Figure 4. Control then passes to the step S5. The steps
S11 to S14 are repeated until all face candidates have been tested.

[0096] It is possible for this method to detect more than one face in an image. However, in certain applications such
as current observer tracking autostereoscopic displays, only a single user is permitted. If more than one face is detected,
a selection rule may be used to select a single template. For example, the selected template may be the first one to
be detected or may be the one that is positioned nearest the centre of the image. As an alternative, each template may
be given a quality measure, for instance relating to the degree of symmetry, and the template with the best quality
measure may be selected. Such a technique is described in more detail hereinafter.

[0097] The extraction of facial features forming the step S12 is shown in more detail in Figure 15 and comprises
steps S30 to S39. In the step S30, a region of the red component image of the required template size is selected. The
step S31 detects whether an omega-shape has been detected and, if so, the position of the detected omega-shape
based on symmetric measure is stored or updated in the step S32. The step S33 determines whether all possible
positions have been tested and, if not, the step S30 selects another region from the image area specified by the face
candidate.

[0098] Once all possible positions have been tested, the step S34 determines whether any omega-shape vertical
integral projection has been detected. If so, the step S35 determines whether two eyes exist in the template-sized
region. If so, the step S36 determines whether a nose has been detected. If so, a step S38 sets a flag to indicate that
a face has been detected and stores the position of the face. If any of the tests in the steps S34 to S36 is negative,
the step S37 sets a flag to indicate that no face has been detected. The step S39 finishes analysis of the face candidate.
[0099] Figure 16 illustrates the desired eye region template 11 and shows, below this, the corresponding vertical
integral projection profile which resembles the letter "ω". The step S31 detects such profiles, which are characterised by a
peak or maximum brightness V0 at a horizontal position X0 with first and second minima of brightness V1 and V2
located symmetrically about the maximum at X1 and X2. The required template or target image size is illustrated at 21
in Figure 17 and comprises k by l pixels. The image area of a face candidate comprises K by L pixels and is illustrated
at 22. The step S30 selects an initial region 23 of the required template size for analysis and the steps S31 to S33 are
performed. The step S30 then selects a horizontally adjacent region 24 which is displaced to the right relative to the
region 23 by a distance Sx. This is repeated until the selected regions have covered the top strip of the image area
22. The process is further repeated with a vertical displacement Sy from a starting position indicated at 25. Thus, each
horizontal strip is "covered" by horizontally overlapping regions and the whole of the area 22 is covered by vertically
overlapping strips until the selected region is located at 26. The step S33 determines that all possible positions have
been tested and the step S34 is then performed.

[0100] The function of the step S31 is illustrated in more detail by steps S40 to S48 in Figure 18. The step S40 selects
the subsection of the image of width k pixels. The parameter k is chosen so that, with the relative horizontal displacement
Sx, each strip is covered by overlapping rectangles. Similarly, the parameters l and Sy are selected to give vertically
overlapping strips. In general, these parameters are selected so that Sx is equal to k/4 and Sy is equal to l/4.
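The overlapping search positions may be enumerated as in the following sketch, which takes Sx = k/4 and Sy = l/4 rounded down to at least one pixel. The rounding, the raster ordering and the names `window_positions`, `area_w`, `area_h`, `tmpl_w` and `tmpl_h` are assumptions of the sketch.

```python
def window_positions(area_w, area_h, tmpl_w, tmpl_h):
    """Top-left corners of template-sized regions covering a K-by-L area.

    area_w, area_h are K and L; tmpl_w, tmpl_h are k and l.
    Steps are a quarter of the template size, at least one pixel.
    """
    sx = max(1, tmpl_w // 4)
    sy = max(1, tmpl_h // 4)
    return [(x, y)
            for y in range(0, area_h - tmpl_h + 1, sy)   # strip by strip
            for x in range(0, area_w - tmpl_w + 1, sx)]  # left to right
```

For a 16×8 area and an 8×4 template, this yields five horizontal positions in each of five strips.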
[0101] The step S41 calculates the vertical projection function V(x). This is calculated as:

$$V(x) = \sum_{y=y_1}^{y_2} I(x,y)$$

where I(x,y) is the intensity of the pixel at coordinates x, y and the area of the subsection image is given by (x1,x2)×
(y1,y2). The step S42 then detects the peak or maximum of this function and finds the horizontal position X0.
[0102] The step S43 determines whether the position of the maximum is within the central region of the subsection,
which is defined as the region from k/4 to 3k/4. If not, control returns to the step S40. Otherwise, the step S44 detects
the minima on both sides of the peak or maximum and finds their positions X1 and X2. The step S44 then detects
whether the locations of the minima correspond to the eye separation of an adult. This eye separation is normally
between 55 and 70 mm and the corresponding thresholds are T1 and T2. If the magnitude of the difference between
X1 and X2 lies between these thresholds (step S45), the step S46 is performed. Otherwise, control returns to the step
S40.

[0103] The step S46 forms the peak-to-valley ratio R in accordance with the expression:

$$R = 1 - \frac{\min(V(X_1), V(X_2))}{V(X_0)}$$

[0104] The step S47 compares the ratio R with a threshold T3, for which a typical value is 0.2. If the ratio is below
this threshold, control returns to the step S40. If the ratio exceeds the threshold, the step S48 indicates that an omega
shape has been detected.
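The steps S41 to S48 may be sketched as follows for a single subsection. The sketch assumes that the minimum on each side is the lowest value anywhere on that side of the peak, that positions are measured in pixels from the left edge of the subsection, that the thresholds T1 and T2 are supplied in pixels, and that the ratio is taken as R = 1 - min(V(X1), V(X2))/V(X0); the function names are hypothetical.

```python
def vertical_projection(img):
    """V(x): sum of the intensities down each column of a list of rows."""
    return [sum(column) for column in zip(*img)]


def detect_omega(v, t1, t2, t3=0.2):
    """Return (x0, x1, x2, r) if the profile v is omega-shaped, else None."""
    k = len(v)
    x0 = max(range(k), key=lambda x: v[x])            # S42: peak position
    if not (k / 4 <= x0 <= 3 * k / 4):                # S43: central region
        return None
    if x0 == 0 or x0 == k - 1 or v[x0] <= 0:          # need both sides
        return None
    x1 = min(range(0, x0), key=lambda x: v[x])        # S44: left minimum
    x2 = min(range(x0 + 1, k), key=lambda x: v[x])    # S44: right minimum
    if not (t1 <= abs(x2 - x1) <= t2):                # S45: eye separation
        return None
    r = 1 - min(v[x1], v[x2]) / v[x0]                 # S46: peak-to-valley
    if r <= t3:                                       # S47: threshold T3
        return None
    return x0, x1, x2, r                              # S48: omega detected
```

A profile such as `[9, 4, 2, 5, 8, 10, 8, 5, 2, 4, 9]` has its peak at position 5 and minima at positions 2 and 8, so that it is accepted when the separation thresholds bracket 6 pixels.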

[0105] When an omega shape has been detected, a quality measure which is related to the degree of horizontal 
symmetry about a centre line of the subsection is calculated. For example, this may be calculated as: 

[Equation for the quality measure Q not legible in the source.]

[0106] The quality measure may be used to select the "best" omega shape for the current face candidate and, in
particular, to determine the best horizontal and vertical position of the eye region, although the vertical position may
be determined as described hereinafter.

[0107] Figure 19 illustrates the effect of inappropriate choice of the horizontal step size Sx. In particular, if Sx is set
to a large value, for example greater than k/2, it is possible that a peak or maximum will not be detected in any
subsection. As shown in the vertical integral projection profile in Figure 19 and, in particular, in the shaded parts, there is
no maximum or peak within the central region so that the step S42 would find a position X0 which is outside the range
in the step S43. The size of the step Sx should therefore be smaller than k/2 and a value of k/4 has been found to work
well in maintaining computing efficiency while avoiding missing the central peak of the omega shaped profile.

[0108] The peak of the best omega shape, for instance having the highest quality measure Q, indicates the middle
of the two eyes of the eye region and defines the centre position of the target image or template. However, the vertical
position is not well defined because subsections displaced slightly upwards or downwards from the best position are
likely to display similar omega shaped vertical integral projection profiles.

[0109] One technique for vertically centering the subsection on the eye region involves locating the best horizontal
position and then displacing the subsection upwardly and downwardly until the omega shape can no longer be detected.
A vertical position mid way between these upper and lower limit positions may then be selected as the vertical position
for the target image.

[0110] An alternative technique for locating the correct vertical position is based on the peak-to-valley ratio. In this
case, the best horizontal position is again determined and the subsections are displaced vertically while monitoring
the peak-to-valley ratio. The position corresponding to the highest ratio is then selected as the vertical position of the
middle of the target image.

[0111] Although the existence of an omega shape in the vertical integral projection profile is a strong indication that
an eye region exists, this is based largely on the assumption of symmetry of a human face. However, an image which
is not symmetrical with respect to its centre line may also produce an omega-shaped profile. An example of such an
image is illustrated at the middle of Figure 20 directly above an eye region and the vertical profile for both images is
substantially the same and is illustrated at the top of Figure 20. In this case, the non-symmetrical image is obtained
by reflecting the left half of the image about the centre line and then turning the resulting right half image upside down.
[0112] In order to avoid false face detections caused by such images, a technique based on horizontal integral
projection profiles may be used. In particular, when an omega shape has been detected and an image area of the desired
template size is selected such that its centre is aligned with the central peak or maximum of the omega shape, horizontal
integral projections are applied to the left and the right halves of the image. The horizontal integral projection profile
for the left half is given by:



[Equation for the left-half profile not legible in the source; the surviving summation limits are x = 0 to X0.]



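[0113] The horizontal integral projection profile for the right half is given by:

[Equation for the right-half profile not legible in the source; the surviving summation limits are x = 0 to X0.]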



[0114] A symmetrical measure Sm is then defined as:

[Equation for Sm not legible in the source; the surviving lower summation limit is y = y1.]



[0115] The minimum and maximum values of Sm are 0 and 1. The value of Sm should not exceed a predetermined
threshold, which is typically between 0.15 and 0.2. By accepting an omega shape only if it passes this test, the chances
of false detection are reduced.
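A measure with the stated range of 0 to 1 may be sketched as follows. The exact expression for Sm is not reproduced here, so the normalised absolute difference between the two profiles used in this sketch is an assumption chosen only because it has the same range and behaves as described; the function names, and the choice to assign the column at x0 to the right half, are likewise hypothetical.

```python
def horizontal_projections(img, x0):
    """Row sums of the left (x < x0) and right (x >= x0) halves of img."""
    left = [sum(row[:x0]) for row in img]
    right = [sum(row[x0:]) for row in img]
    return left, right


def symmetry_measure(img, x0):
    """Assumed measure: sum|HL - HR| / sum(HL + HR), from 0 (symmetric) to 1."""
    left, right = horizontal_projections(img, x0)
    total = sum(left) + sum(right)
    if total == 0:
        return 0.0  # blank image: treated as symmetric
    return sum(abs(a - b) for a, b in zip(left, right)) / total
```

An image whose right half is the left half turned upside down, as in the false example of Figure 20, gives a high value, whereas a left-right symmetric image gives 0.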

[0116] The horizontal integral projection profiles for the two images are illustrated in Figure 20. The false image gives
horizontally asymmetrical profiles whereas the image of the eye region gives substantially symmetrical profiles. This
technique may be inserted between the steps S47 and S48 in Figure 18 such that a positive result passes control to
the step S48 whereas a negative result passes control to the step S40.

[0117] Detection of the omega shape reduces the chances of false face detection but further tests may be performed,
for instance as illustrated by the steps S35 and S36 in Figure 15, so as to reduce still further the chances of false
detections. Detection of the omega shape allows the middle of a face to be located, assuming that a face is present
in the image. The eye regions are usually darker so that two brightness minima should be present and should be
substantially horizontally symmetrically disposed with respect to the middle line. This may be tested with respect to
the RGB domain but does not need to be applied at the full image resolution. In fact, a lower image resolution may
have the advantage of reducing the chances of any isolated dark pixel being taken as the minimum corresponding
to an eye.

[0118] Although the head of a user will normally be in a substantially upright position during the initialisation stage,
an absolutely upright position is not essential. Thus, the two minima do not necessarily lie on the same horizontal line.
It is therefore useful to reduce the image resolution by averaging, for instance as described hereinbefore. A single
colour component image, such as the red component image, is sufficient for this purpose. A suitable resolution for this
test is such that the target image contains only a few pixels in each dimension, for example 5 by 5 or 7 by 7 pixels. As
illustrated in Figure 21, the locations of the minima are represented as (XL, YL) and (XR, YR). The step S35 determines
whether

$$|Y_L - Y_R| \le T_4$$

and

$$|X_L + X_R - 2X_0| \le T_4$$

where X0 is the centre position and T4 is a threshold, for instance having a value of 1.

[0119] If the step S35 confirms the existence of two eye regions, the likelihood that these regions correspond to
actual eyes in the image is enhanced if a brighter region is detected between the minima. A typical nose pattern is
illustrated in Figure 22 and represents the observation that the nose is usually brighter than the image just below the
tip of the nose. The nose region as shown in Figure 22 should have a length of two or three pixels depending on the
actual size of the face. In this case, the nose region is accepted if the following conditions are satisfied:

$$\min(P_1, P_2, P_3)/\max(P_1, P_2, P_3) \ge T_5$$

and

$$\operatorname{mean}(P_4, P_5, P_6)/\operatorname{mean}(P_1, P_2, P_3) \le T_6$$

where T5 and T6 are predetermined thresholds and typically have values of 0.6 and 0.5, respectively.
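The eye-pair test of the step S35 and the nose test of the step S36 may be sketched together as below. The sketch assumes that the eye conditions are |YL - YR| ≤ T4 and |XL + XR - 2X0| ≤ T4, that the nose pixels P1 to P3 must be mutually similar (ratio at least T5) and that the pixels P4 to P6 below the tip must be darker (ratio at most T6); the directions of these inequalities are inferred from the surrounding text, and all function names are hypothetical.

```python
def darkest(img, x_from, x_to):
    """(x, y) of the darkest pixel among columns x_from .. x_to - 1."""
    points = ((x, y) for y in range(len(img)) for x in range(x_from, x_to))
    return min(points, key=lambda pt: img[pt[1]][pt[0]])


def eyes_plausible(left, right, x0, t4=1):
    """Step S35: the two minima are nearly level and symmetric about x0."""
    (xl, yl), (xr, yr) = left, right
    return abs(yl - yr) <= t4 and abs(xl + xr - 2 * x0) <= t4


def nose_plausible(nose, below, t5=0.6, t6=0.5):
    """Step S36: nose pixels are uniform and brighter than those below."""
    brightest = max(nose)
    if brightest == 0:
        return False  # no bright nose region at all
    uniform = min(nose) / brightest >= t5
    mean_nose = sum(nose) / len(nose)
    mean_below = sum(below) / len(below)
    return uniform and mean_below / mean_nose <= t6
```

In use, `darkest` would be applied to the low resolution image on each side of the centre column x0 and the two results passed to `eyes_plausible`.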

[0120] The above methods for detecting eyes and nose are carried out in lower resolution to improve computing
efficiency. Other facial feature extraction methods may be applied to further verify the presence of a face. For example,
the following methods describe the detection of eye pupils and mouth lips using the original full resolution RGB image.
Figure 23 illustrates another embodiment of the step S12 of Figure 14 in that steps S60 and S61 are added. The step
S60 performs a high resolution detection of eye pupils and mouth and the step S61 performs a geometrical constraints
test, both of which are described in more detail hereinafter.

[0121] The precise position of each eye may be identified as the centre of the eye pupil. The first step to determine
the centre of the eye pupil is to segment the eye pupil from the rest of the eye region and the face skin. It has been
found that the following inequality holds for the pixels over the eye region except those of the eye pupil:
R > G > B

[0122] The following equation is used to detect the eye pupil:

$$\begin{cases} 0 & \text{if } R - G > C_1 \text{ and } G - B > C_2 \\ 1 & \text{otherwise} \end{cases}$$

where the value 1 denotes a pixel inside the eye pupil region and 0 a pixel outside, and where C1 and C2 are two
constants. Typical values of these two parameters are given by:
C1 = C2 = 0
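The pupil segmentation may be sketched as follows. The mask follows the rule stated above; the `centroid` helper computes the ordinary mean position of the flagged pixels, which is an assumption of the sketch and is not necessarily the normalisation used in the description. Both function names are hypothetical.

```python
def pupil_mask(rgb_rows, c1=0, c2=0):
    """1 for pupil pixels, 0 where R - G > C1 and G - B > C2 (skin-like)."""
    return [[0 if (r - g > c1 and g - b > c2) else 1 for (r, g, b) in row]
            for row in rgb_rows]


def centroid(mask):
    """Mean (x, y) of the pixels flagged 1, or None if there are none."""
    points = [(x, y)
              for y, row in enumerate(mask)
              for x, value in enumerate(row) if value == 1]
    if not points:
        return None
    count = len(points)
    return (sum(x for x, _ in points) / count,
            sum(y for _, y in points) / count)
```

A dark grey pupil pixel such as (30, 30, 30) fails R > G > B and is therefore flagged, whereas a skin-coloured pixel such as (200, 150, 120) is not.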

[0123] The initial best eye template position is given by the location where the best omega-shape is detected as
described earlier. The peak position of the omega-shape divides this region into two halves. The above eye pupil
detection method may then be applied to each half separately. The eye positions are then defined as the centroids of
the detected eye pupils. For example, the left eye position is given by:

[Equations for XL and YL not legible in the source; the surviving lower summation limits are x = x1 and y = y1.]

where N is the total number of pixels in the area whose top-left corner is at (x1,y1) and whose bottom-right corner is
at (x2,y2). The position (XL,YL) then defines the centre of the left eye pupil. Similarly the position of the right eye pupil
can be determined as (XR,YR). This is illustrated in Figure 24. The eye separation is then given by:



[0124] If the eye pupils are detected, the mouth may then be located within the rectangular area A'B'C'D' as illustrated
in Figure 24. The left side of this area A'B'C'D' is determined by the position of the left eye pupil and the right side by
that of the right eye pupil. The top side of the area is located 0.5 D_eye below the line linking the two eye pupils and the
bottom side is located 1.5 D_eye below the same line, where D_eye is the eye separation.
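The centroid and mouth search area steps might be sketched as follows. Several details are assumptions made for this example rather than statements from the text: the centroid is normalised by the number of detected pupil pixels, the eye separation is taken as the Euclidean distance between the two pupil centres, the eye line is taken at the mean pupil height, and image y coordinates increase downwards. The function names are hypothetical.

```python
import math


def pupil_centroid(mask, x1, y1, x2, y2):
    """Centroid of the pixels with value 1 inside the inclusive box.

    Assumption: the sums are divided by the count of pupil pixels.
    Returns None when no pupil pixel is present.
    """
    sum_x = sum_y = count = 0
    for y in range(y1, y2 + 1):
        for x in range(x1, x2 + 1):
            if mask[y][x]:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return sum_x / count, sum_y / count


def mouth_search_area(left_eye, right_eye):
    """Return (left, top, right, bottom) of the mouth search rectangle."""
    (xl, yl), (xr, yr) = left_eye, right_eye
    d_eye = math.hypot(xr - xl, yr - yl)   # assumed definition of D_eye
    eye_line_y = (yl + yr) / 2             # assumed height of the eye line
    return xl, eye_line_y + 0.5 * d_eye, xr, eye_line_y + 1.5 * d_eye


# Hypothetical pupil mask: two pupil pixels in each half of the eye region.
mask = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
left = pupil_centroid(mask, 0, 0, 3, 2)
right = pupil_centroid(mask, 4, 0, 7, 2)
area = mouth_search_area(left, right)
```

Here the two halves are split at a fixed column for brevity; in the method described above the split would be at the peak position of the omega shape.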

[0125] The detection of the mouth is achieved by detecting the lips. The lips are segmented from the face using the
following equation:

\[ M(x,y) = \begin{cases} 1 & \text{for } R > G > B \text{ and } R < \eta G \\ 0 & \text{otherwise} \end{cases} \]

where the value 1 denotes a lip pixel and 0 a skin pixel, and where η is a constant whose typical value is set to 2.5.
[0126] A vertical histogram is then constructed using the following equation:

\[ H_y(x) = \sum_{y} M(x,y) \]

[0127] This is illustrated in Figure 24. If the mouth does exist, the above histogram would usually produce a peak at
the centre and decrease gradually on both sides. If a peak is detected at position X_p, the left end of the mouth is given
by the first X_1 at which the value of the histogram satisfies the following inequality:

\[ H_y(X_1) < \mu H_y(X_p) \]

[0128] where μ is a constant and is typically set to 0.1. The right end of the mouth is similarly determined as X_2.
[0129] The height of the mouth is determined similarly using a horizontal projection profile of M(x,y). This gives the
top position of the mouth as Y_1 and the bottom as Y_2. The mouth is therefore enclosed by the rectangle whose top-left
corner is (X_1, Y_1) and whose bottom-right corner is (X_2, Y_2).
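A minimal sketch of the lip segmentation and projection steps is given below. It assumes the vertical histogram is the column-wise sum of the lip mask and that each end of the mouth is found by walking outwards from the peak until the profile first drops below μ times the peak value. The helper names (`lip_mask`, `first_below`, `mouth_bounds`) and the sample data are hypothetical.

```python
def lip_mask(pixels, eta=2.5):
    """M(x, y): 1 where R > G > B and R < eta * G, else 0."""
    return [[1 if (r > g > b and r < eta * g) else 0 for (r, g, b) in row]
            for row in pixels]


def first_below(profile, peak, step, threshold):
    """Walk from the peak in direction `step`; return the first index whose
    value is below `threshold`, or the border index if none is found."""
    i = peak
    while 0 <= i + step < len(profile):
        i += step
        if profile[i] < threshold:
            break
    return i


def mouth_bounds(mask, mu=0.1):
    """Return (X1, Y1, X2, Y2) from the two projection profiles of the mask."""
    h_y = [sum(col) for col in zip(*mask)]   # vertical histogram, one bin per x
    h_x = [sum(row) for row in mask]         # horizontal profile, one bin per y
    xp = h_y.index(max(h_y))                 # peak position X_p
    yp = h_x.index(max(h_x))
    x1 = first_below(h_y, xp, -1, mu * h_y[xp])
    x2 = first_below(h_y, xp, +1, mu * h_y[xp])
    y1 = first_below(h_x, yp, -1, mu * h_x[yp])
    y2 = first_below(h_x, yp, +1, mu * h_x[yp])
    return x1, y1, x2, y2


# Hypothetical lip mask with a lens-shaped blob of lip pixels.
blob = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
bounds = mouth_bounds(blob)
```

The returned rectangle is bounded by the first positions at which each profile falls below the threshold, so it sits one pixel outside the blob on every side.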

[0130] If a mouth is present, its aspect ratio should satisfy the following geometrical constraints:

\[ \alpha < \frac{X_2 - X_1}{Y_2 - Y_1} < \beta \]

where α is typically set to 1.5 and β to 5.

[0131] The vertical distance between the top of the mouth and the line linking the two eyes is defined as:

\[ D_{ME} = Y_1 - \frac{Y_L + Y_R}{2} \]

[0132] The value of Y_2, that is the position of the lower lip, changes more significantly than the value of Y_1, that is
the position of the top lip, in particular when the user is talking. In the above equation, Y_1 has been used to indicate
the mouth position in the vertical direction.
[0133] It has been found that this distance is proportional to the eye separation, with a typical ratio of 1. The relative
position of the mouth and the eyes therefore should satisfy the following condition:

\[ \left| \frac{D_{ME}}{D_{eye}} - 1 \right| < \nu \]

where ν is typically set to 0.25. The step S61 checks whether these geometrical constraints are satisfied.
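The geometrical constraints test of the step S61 might be sketched as below. The sketch rests on assumptions that should be checked against the original formulas: the aspect ratio is taken as mouth width over mouth height, the mouth-to-eye distance is measured from the mean pupil height to the top of the mouth, and the eye separation is the Euclidean distance between the pupil centres. The function name and the sample coordinates are hypothetical.

```python
import math


def mouth_geometry_ok(x1, y1, x2, y2, left_eye, right_eye,
                      alpha=1.5, beta=5.0, nu=0.25):
    """Check the aspect-ratio and mouth-to-eye distance constraints."""
    width, height = x2 - x1, y2 - y1
    if height <= 0:
        return False
    aspect_ok = alpha < width / height < beta

    (xl, yl), (xr, yr) = left_eye, right_eye
    d_eye = math.hypot(xr - xl, yr - yl)      # assumed eye separation
    if d_eye == 0:
        return False
    d_mouth_eye = y1 - (yl + yr) / 2          # assumed vertical distance
    distance_ok = abs(d_mouth_eye / d_eye - 1) < nu
    return aspect_ok and distance_ok


# Hypothetical pupil centres 60 pixels apart on the same row.
eyes = ((100, 100), (160, 100))
plausible = mouth_geometry_ok(110, 160, 150, 175, *eyes)
too_high = mouth_geometry_ok(110, 130, 150, 145, *eyes)
```

The first mouth box starts one eye separation below the eye line and passes both tests; the second starts only half an eye separation below and is rejected.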
[0134] A further measure of symmetry may be based on a comprehensive symmetry detector as disclosed in D
Reisfeld, H Wolfson and Y Yeshurun, "Context free attentional operators: the generalized symmetry transform", IJCV,
vol 14, pp. 119-130, 1995, and D Reisfeld and Y Yeshurun, "Robust detection of facial features by generalized symmetry",
Proc of the 11th IAPR International Conference on Pattern Recognition, pp. 117. Such a comprehensive arrangement
is impractical for commercial implementation of the present method but a substantially simplified technique may be used
to provide a measure of symmetry which assists in confirming the presence of a face or part of a face in a template.
[0135] Figure 25 illustrates a side-lit image of a person and a rectangular area ABCD containing a subsection of the
image. The subsection is divided into a left half AEFD and a right half EBCF. For any point in the right half, there is
a corresponding point in the left half in the "mirror image" position. If the subsection ABCD contains a target which
is symmetrical with respect to the middle line EF, the points P_1 and P_2 form a pair of symmetric points.
[0136] Under absolutely uniform illumination, the brightness or intensity of these two points would be identical. However,
as illustrated in Figure 25, typical lighting conditions are such that the intensities of symmetric points are different.
[0137] This problem may be overcome by using "image gradients", which are vectors describing the intensity change
at each point. In particular, each such vector has a magnitude equal to the maximum change in intensity from the point
in any direction and a direction or angle such that the vector points in the direction of maximum intensity change. The
gradient amplitude is also affected by the type of illumination but the phase angle is largely dependent on the geometric
features of the face and is less affected by illumination. Thus, the points P_1 and P_2 are regarded as symmetric if their
gradient angles θ_1 and θ_2 satisfy the following condition:

\[ \theta_1 + \theta_2 = \pm \pi \]

[0138] The symmetric measure of the subsection ABCD is given by:

\[ S_s = \sum_{\substack{(x,y) \in EBCF \\ (x',y') \in AEFD}} \left\{ 1 - \cos\left[ \theta_1(x,y) + \theta_2(x',y') \right] \right\} \]

where (x,y) and (x',y') are the coordinates of the point pairs in the two halves of the image subsection.
[0139] This measure S_s may be calculated for any subsection in the image by searching from left to right and top to
bottom. The section having the highest value of S_s is then selected as the area containing the image face.
[0140] The measure S_s may be further refined in accordance with the following expression:

\[ S_s = \sum_{\substack{(x,y) \in EBCF \\ (x',y') \in AEFD}} \left\{ 1 - \cos\left[ \theta_1(x,y) + \theta_2(x',y') \right] \right\} w(x,y) \, w(x',y') \]

where w(x,y) and w(x',y') are weight functions. For instance, the weight functions may be the gradient amplitude at
each point so that strong edges contribute more to the value of S_s. In practice, a binary weight function may be used
and may be formed by thresholding the gradient amplitude such that, if the gradient amplitude exceeds a given threshold,
the weight function is set to 1 and, otherwise, the weight function is set to 0. The threshold may be made equal to
half of the mean value of the gradient amplitude of the subsection.
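The weighted symmetry measure with a binary weight function might be sketched as follows. The gradient operator is not specified in this passage, so central differences are assumed; border pixels are given zero gradient, and the function names and test images are hypothetical.

```python
import math


def gradients(img):
    """Central-difference gradient angle and amplitude (assumed operator).

    Border pixels keep zero amplitude, so they never pass the threshold.
    """
    h, w = len(img), len(img[0])
    ang = [[0.0] * w for _ in range(h)]
    mag = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y][x + 1] - img[y][x - 1]) / 2
            gy = (img[y + 1][x] - img[y - 1][x]) / 2
            ang[y][x] = math.atan2(gy, gx)
            mag[y][x] = math.hypot(gx, gy)
    return ang, mag


def symmetry_measure(img):
    """Sum of 1 - cos(theta_1 + theta_2) over mirrored point pairs, using a
    binary weight: amplitude above half the mean amplitude of the subsection."""
    ang, mag = gradients(img)
    h, w = len(img), len(img[0])
    threshold = 0.5 * sum(map(sum, mag)) / (h * w)
    total = 0.0
    for y in range(h):
        for x in range((w + 1) // 2, w):      # points in the right half
            xm = w - 1 - x                     # mirrored point in the left half
            if mag[y][x] > threshold and mag[y][xm] > threshold:
                total += 1 - math.cos(ang[y][x] + ang[y][xm])
    return total


# Hypothetical 6 x 6 subsections: one mirror-symmetric, one a plain ramp.
symmetric = [[10 * min(x, 5 - x) + y for x in range(6)] for y in range(6)]
ramp = [[x for x in range(6)] for y in range(6)]
s_sym = symmetry_measure(symmetric)
s_ramp = symmetry_measure(ramp)
```

For a mirror-symmetric pattern the horizontal gradient components of each point pair are opposite, so the angles sum to ±π and each pair contributes the maximum value of 2; for the ramp, both angles are 0 and the pairs contribute nothing.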

[0141] It is desirable that the target image be captured from an upright position of the face. When, for example, a
user sits down in front of a display and starts to look at the display, the system starts to locate the face and find the
target image. The first target image which is detected may not be the best as the user may not be in the upright position.
Thus, it may not be appropriate to select the first detected target image as the template, for instance for subsequent
observer tracking.

[0142] Figure 26 illustrates a modified method based on that illustrated in Figure 6. In particular, steps S50 to S53
are inserted between the steps S6 and S7. When a template is found in the step S6, the step S50 calculates the
measure of the "goodness" of the image contained in the template. For instance, this measure may be based on the
symmetric measure S_s as described hereinbefore. The step S51 determines whether the template has been found in
the last R images or frames. If not, control returns to the step S1. If so, the step S52 compares the goodness measure
of the most recently detected template with the previous best template. If the most recent template has a higher goodness
value, it is selected as the current best template.

[0143] The step S53 determines whether templates have been found more than r times in the last R frames. If not,
control returns to the step S1. If so, the step S7 outputs the best template, i.e. that having the highest goodness measure.
[0144] The method illustrated in Figure 26 thus determines whether more than r templates have been detected in
the last R consecutive frames. For instance, r may be equal to 7 and R may be equal to 10. If this is the case, the
target image is regarded as detected consistently and the best template is used for subsequent observer tracking.
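The consistency check of steps S50 to S53 can be illustrated with a sliding window over the last R frames. This is a simplified sketch, not the flow of Figure 26 itself: the class name `TemplateSelector` is hypothetical, only the goodness value is stored (a real system would keep the template alongside it), and the best value is taken over the detections still inside the window.

```python
from collections import deque


class TemplateSelector:
    """Track detections over the last R frames and report the best one."""

    def __init__(self, r=7, R=10):
        self.r = r
        self.history = deque(maxlen=R)   # one entry per frame, None = no template

    def update(self, goodness):
        """Record one frame. `goodness` is None when no template was found.

        Returns the highest goodness in the window once templates have been
        found more than r times in the last R frames, otherwise None.
        """
        self.history.append(goodness)
        found = [g for g in self.history if g is not None]
        if len(found) > self.r:
            return max(found)
        return None


selector = TemplateSelector(r=7, R=10)
# Seven detections are not yet "more than r", so nothing is output.
results = [selector.update(g) for g in [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]]
eighth = selector.update(0.5)
```

With r = 7 and R = 10, the eighth detection within ten frames triggers the output, and the template with the highest goodness value (7.0 here) is chosen rather than the most recent one.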
[0145] It is possible for the face locator illustrated in Figure 7 to be replaced by a semi-automatic method requiring
some user assistance. For example, if a black-and-white video camera is used, colour information would not be available
so that the face locator illustrated in Figure 7 may no longer work.

[0146] In the semi-automatic method, each incoming video image is displayed with a graphics head guide about the
same size as an adult head in the centre of the display. The user sees a live image sequence of himself with a fixed
graphics guide so that he can position his head within the guide. The face analyser 18 is applied to the region within
the graphics head guide and, once the head of the user is disposed within this region, detects the face and locates the
precise position of the target image. There is no requirement for the user to have an accurate alignment, which is an
inconvenient requirement in the method disclosed in GB 2 324 428 and EP 0 877 274. Also, the possibility of detecting
false targets in the background is reduced because the face analyser 18 searches only in the area specified by the
head guide.

[0147] If the lighting is very poor, for instance with extremely biased lighting, the semi-automatic method may not
work reliably. In this case, the decision to accept the template may be left to the user rather than to the apparatus
performing the method. For instance, this may be achieved by displaying a graphics overlay on top of the displayed
image of the user after the target image is found. The user can see the position of the target image and can decide
whether to accept the template.
[0148] The difference between this method and the manual method is that the user does not need to make a special
effort to align his head with the "overlay" graphics in order to select the template. Instead, the computer suggests the
template and, if this is correct, the user need only signal acceptance, for instance by pressing a button or key. Otherwise,
the system may revert to the manual mode. This arrangement ensures that a reliable template is always available to
make successful tracking possible.
[0149] In ambient lighting where the face receives approximately the same illumination on both sides, detection of
the omega shape in the vertical integral projection profile works well. However, when illumination is strongly biased to
one side of the face, this technique may be less reliable but can be improved by supplying modified image data to the
step S4 in Figure 6 as follows.

[0150] The image in the image area is "mirrored" or reversed horizontally about the vertical centre line and then
added back to the original image. In the ideal case where the face is geometrically symmetrical and the centre line is
in the middle of the face, a resulting image of the face with symmetrical illumination on both sides is produced. The
vertical integral projection profile of such an image then has an omega shape which is also symmetrical and the processing
steps as described hereinbefore with reference to the face analyser 18 may be used on the modified image data.
[0151] The initial line of symmetry which is chosen may not be at the centre of the face. This technique may therefore
be applied iteratively such that the detected peak in the profile is used as the mirror point, resulting in a better omega
shape with a more precise peak position. This may be repeated until the peak position does not substantially change.
Typically, such an iterative procedure requires fewer than ten iterations.

[0152] The method described hereinbefore works well with uniform lighting including ambient lighting and is applicable
to applications under poor lighting conditions by using an active light source. Although the method does not
require any special lighting and is very resilient to changes in the lighting of an observer, an active light source may
be used during the initialisation stage 9 of Figure 2 and then switched off during subsequent observer tracking, which
is highly robust and does not require special lighting.

[0153] Figure 27 shows a display of the type shown in Figure 2 modified to provide active lighting. The active light
source comprises a flash light 55 with a synchroniser controlled by the processor 4. The flash light 55 is disposed in
a suitable position, such as above the display 7 and adjacent the sensor 3, for illuminating the face of an observer.
[0154] Figure 28 illustrates the video tracking system 2 and specifically the data processor 4 in more detail. The data
processor comprises a central processing unit (CPU) 56 connected to a CPU bus 57. A system memory 58 is connected
to the bus 57 and contains all of the system software for operating the data processor.

[0155] The video camera 3 is connected to a video digitiser 59 which is connected to a data bus 60, to the flash light
with synchroniser 55, to the CPU 56 and to an optional video display 61 when provided. A frame store 62 is connected
to the data bus 60 and the CPU bus 57. The mouse 8 is connected to the CPU 56.

[0156] For embodiments not using active lighting, the frame store need only have a capacity of one field. In the case
of the video camera 3 described hereinbefore and having a field resolution of 640 x 240 pixels and for a 24 bit RGB
colour signal, a capacity of 640 x 240 x 3 = 460800 bytes is required. For embodiments using active lighting, the frame
store 62 has a capacity of two fields of video data, i.e. 921600 bytes.
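The capacity figures follow directly from the field size and can be checked with a short calculation. The function name `frame_store_bytes` is hypothetical and is used only for this illustration.

```python
def frame_store_bytes(width=640, height=240, bits_per_pixel=24, fields=1):
    """Bytes needed to hold `fields` video fields of the given size."""
    return width * height * (bits_per_pixel // 8) * fields


one_field = frame_store_bytes()            # embodiments without active lighting
two_fields = frame_store_bytes(fields=2)   # flash-lit plus ambient-only field
```

The second field doubles the requirement because the flash-lit image and the ambient-only image must both be held before they can be compared.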

[0157] In use, the flash light 55 is synchronised with the video camera 3 and with the video digitiser 59 so that the
flash light is switched on or off at the appropriate time when an image is being captured.

[0158] The flash light 55 is used to flash light at the face of the observer so as to improve the uniformity of distribution.
If the flash light 55 is much stronger than the ambient light, the intensity of the face is largely determined by the flash
light 55. However, the use of a strong light source tends to produce an over-saturated image, in which many objects
may be falsely detected as face-like regions. Further, the use of a powerful flashing light may become unpleasant to
the observer and might cause damage to the eyes.
[0159] The flash light 55 should therefore be of mild intensity. In this case, the effects of ambient light may need to
be reduced so as to improve the reliability of detecting genuine face-like regions.

[0160] The method illustrated in Figure 6 may be modified so as to compare two consecutive frames of video image
data in which one is obtained with the flash light 55 illuminated and the other is obtained with ambient light only. The
first of these therefore contains the effect of both the ambient light and the flash light 55. This first image I(a+f) may
therefore be considered to comprise two components:

\[ I(a+f) = I(a) + I(f) \]

where I(a) is the ambient light-only image and I(f) is the image which would have been produced if the only light source
were the flash light 55. This may be rewritten as:

\[ I(f) = I(a+f) - I(a) \]

[0161] Thus, by subtracting the image pixel data, the effect of ambient lighting may be reduced or eliminated so as
to improve the reliability and resilience of the face detection method.
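The subtraction can be sketched per pixel and per colour channel as below. Clamping negative differences to zero is an assumption added here to cope with noise or slight movement between the two frames; it is not stated in the text. The function name and the sample values are hypothetical.

```python
def subtract_ambient(flash_frame, ambient_frame):
    """Estimate I(f) = I(a+f) - I(a) for frames stored as rows of (R, G, B).

    Assumption: negative differences are clamped to 0.
    """
    return [
        [tuple(max(0, cf - ca) for cf, ca in zip(pf, pa))
         for pf, pa in zip(row_f, row_a)]
        for row_f, row_a in zip(flash_frame, ambient_frame)
    ]


# Hypothetical 1 x 2 frames: a face pixel lit by the flash and a distant
# background pixel that the flash barely reaches.
with_flash = [[(180, 140, 120), (62, 60, 58)]]
ambient_only = [[(100, 80, 70), (60, 61, 58)]]
flash_only = subtract_ambient(with_flash, ambient_only)
```

In this example the background pixel almost vanishes after subtraction, whereas the face pixel keeps the contribution of the flash light.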



Claims

1. A method of detecting a human face in an image, comprising locating (17) in the image a candidate face region
and analysing (18) the candidate face region for a first characteristic indicative of a facial feature, characterised
in that the first characteristic comprises a substantially symmetrical horizontal brightness profile comprising a maximum
(V_0) disposed between first and second minima (V_1, V_2) and in that the analysing step (18) comprises forming
(S41) a vertical integral projection (V(x)) of a portion of the candidate face region and determining (S42-S46)
whether the vertical integral projection (V(x)) has first and second minima (V_1, V_2) disposed substantially symmetrically
about a maximum (V_0).

2. A method as claimed in claim 1, characterised in that the locating and analysing steps (17, 18) are repeated for
each image of a sequence of images.

3. A method as claimed in claim 1 or 2, characterised in that the or each image is a colour image and the analysing
step (18) is performed on a colour component of the colour image.

4. A method as claimed in claim 1 or 2, characterised in that the or each image is a colour image and the analysing
step (18) is performed on a contrast image derived from the colour image.

5. A method as claimed in any one of the preceding claims, characterised in that the analysing step (18) determines
(S44, S45) whether the vertical integral projection (V(x)) has first and second minima (V_1, V_2) whose horizontal
separation is within a predetermined range.

6. A method as claimed in any one of the preceding claims, characterised in that the analysing step (18) determines
(S46, S47) whether the vertical integral projection (V(x)) has a maximum (V_0) and first and second minima (V_1,
V_2) such that the ratio of the difference between the maximum and the smaller of the first and second minima to
the maximum is greater than a first threshold.

7. A method as claimed in claim 6, characterised in that vertical integral projections are formed for a plurality of
portions of the face candidate and the portion having the highest ratio is selected as a potential target image.

8. A method as claimed in any one of the preceding claims, characterised in that the analysing step (18) comprises
forming a measure of the symmetry of the portion.

9. A method as claimed in claim 8, characterised in that the symmetry measure is formed as:

^ r=0 

where V(x) is the value of the vertical integral projection at horizontal position x and x_0 is the horizontal position
of the middle of the vertical integral projection.

10. A method as claimed in claim 8 or 9, characterised in that the vertical integral projection is formed for a plurality
of portions of the face candidate and the portion having the highest symmetry measure is selected as a potential
target image.

11. A method as claimed in any one of the preceding claims, characterised in that the analysing step (18) comprises
dividing a portion of the candidate face region into left and right halves, forming a horizontal integral projection
(H_L(y), H_R(y)) of each of the halves, and comparing a measure of horizontal symmetry of the left and right horizontal
integral projections (H_L(y), H_R(y)) with a second threshold.

12. A method as claimed in any one of the preceding claims, characterised in that the analysing step (18) determines
whether the candidate face region has first and second brightness minima disposed at substantially the same
height with a horizontal separation within a predetermined range.

13. A method as claimed in claim 12, characterised in that the analysing step (18) determines whether the candidate
face region has a vertically extending region (P1, P2, P3) of higher brightness than and disposed between first
and second brightness minima.

14. A method as claimed in claim 13, characterised in that the analysing step (18) determines whether the candidate
face region has a horizontally extending region (P4, P5, P6) disposed below and of lower brightness than the
vertically extending region (P1, P2, P3).

15. A method as claimed in any one of the preceding claims, characterised in that the analysing step (18) comprises
locating (S60), in the candidate face region, candidate eye pupil regions where a green image component is greater
than a red image component or where a blue image component is greater than a green image component.

16. A method as claimed in claim 15, characterised in that locating (S60) the candidate eye pupil regions is restricted
to candidate eye regions of the candidate face region.

17. A method as claimed in claim 16, characterised in that the analysing step (18) forms a function E(x,y) for picture
elements (x,y) in the candidate eye regions such that:

\[ E(x,y) = \begin{cases} 0 & \text{for } R - G > C_1 \text{ and } G - B > C_2 \\ 1 & \text{otherwise} \end{cases} \]

where R, G and B are red, green and blue image components, C_1 and C_2 are constants, E(x,y) = 1 represents a
picture element inside the candidate eye pupil regions and E(x,y) = 0 represents a picture element outside the
candidate eye pupil regions.

18. A method as claimed in claim 17, characterised in that the analysing step (18) detects the centres of the eye pupils
as the centroids of the candidate eye pupil regions.

19. A method as claimed in any one of claims 15 to 18, characterised in that the analysing step (18) comprises locating
(S60) a candidate mouth region in a sub-region of the candidate face region which is horizontally between the
candidate eye pupil regions and vertically below the level of the candidate eye pupil regions by between substantially
half and substantially one and a half times the distance between the candidate eye pupil regions.

20. A method as claimed in claim 19, characterised in that the analysing step (18) forms a function M(x,y) for picture
elements (x,y) within the sub-region such that:

\[ M(x,y) = \begin{cases} 1 & \text{for } R > G > B \text{ and } R < \eta G \\ 0 & \text{otherwise} \end{cases} \]

where R, G and B are red, green and blue image components, η is a constant, M(x,y) = 1 represents a picture
element inside the candidate mouth region and M(x,y) = 0 represents a picture element outside the candidate
mouth region.

21. A method as claimed in claim 20, characterised in that vertical and horizontal projection profiles of the function
M(x,y) are formed and a candidate lip region is defined in a rectangular sub-region where the vertical and horizontal
projection profiles exceed first and second predetermined thresholds, respectively.

22. A method as claimed in claim 21, characterised in that the first and second predetermined thresholds are proportional
to maxima of the vertical and horizontal projection profiles, respectively.

23. A method as claimed in claim 21 or 22, characterised in that the analysing step (18) checks (S61) whether the
aspect ratio of the candidate lip region is between first and second predefined thresholds.

24. A method as claimed in any one of claims 21 to 23, characterised in that the analysing step (18) checks (S61)
whether the ratio of the vertical distance from the candidate eye pupil regions to the top of the candidate lip region
to the spacing between the candidate eye pupil regions is between first and second preset thresholds.

25. A method as claimed in any one of the preceding claims, characterised in that the analysing step (18) comprises
dividing a portion of the candidate face region into left and right halves (AEFD, EBCF) and comparing the angles
of the brightness gradient of horizontally symmetrically disposed pairs of points for symmetry.

26. A method as claimed in claim 2 or in any one of claims 3 to 25 when dependent on claim 2, characterised in that
the locating and analysing steps (17, 18) are stopped (S53) when the first characteristic is found r times in R
consecutive images of the sequence.

27. A method as claimed in any one of the preceding claims, characterised in that the locating step (17) comprises
searching the image for a candidate face region having a second characteristic indicative of a human face.

28. A method as claimed in claim 27, characterised in that the second characteristic is uniform saturation.

29. A method as claimed in claim 28, characterised in that the searching step comprises reducing (S22) the resolution
of the image by averaging the saturation to form a reduced resolution image and searching (S23) for a region of
the reduced resolution image having, in a predetermined shape, a substantially uniform saturation which is substantially
different from the saturation of the portion of the reduced resolution image surrounding the predetermined
shape.

30. A method as claimed in claim 29, characterised in that the image comprises a plurality of picture elements and
the resolution is reduced such that the predetermined shape is from two to three reduced resolution picture elements
across.

31. A method as claimed in claim 30, characterised in that the image comprises a rectangular array (30) of M by N
picture elements, the reduced resolution image (31) comprises (M/m) by (N/n) picture elements, each of which
corresponds to m by n picture elements of the image, and the saturation P of each picture element of the reduced
resolution image is given by:

\[ P = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} f(i,j) \]

where f(i,j) is the saturation of the picture element of the ith column and the jth row of the m by n picture elements.

32. A method as claimed in claim 31, characterised by storing the saturations in a store.

33. A method as claimed in claim 31 or 32, characterised in that a uniformity value (u) is ascribed to each of the reduced
resolution picture elements by comparing the saturation of each of the reduced resolution picture elements with
the saturation of at least one adjacent reduced resolution picture element.

34. A method as claimed in claim 33, characterised in that each uniformity value (u) is ascribed a first value if
(max(P) - min(P))/max(P) ≤ T, where max(P) and min(P) are the maximum and minimum values, respectively,
of the saturations of the reduced resolution picture element and the or each adjacent picture element and T is a
threshold, and a second value different from the first value otherwise.

35. A method as claimed in claim 34, characterised in that T is substantially equal to 0.15.

36. A method as claimed in any one of claims 33 to 35 when dependent on claim 32, characterised in that the or each
adjacent reduced resolution picture element has not been ascribed a uniformity value and each uniformity value
is stored in the store in place of the corresponding saturation.

37. A method as claimed in claim 34 or 35 or in claim 36 when dependent on claim 34, characterised in that the
resolution is reduced such that the predetermined shape is two or three reduced resolution picture elements across
and characterised in that the method further comprises indicating detection of a candidate face region when a
uniformity value of the first value is ascribed to any of one reduced resolution picture element, two vertically or
horizontally adjacent reduced resolution picture elements and a rectangular two-by-two array of picture elements
and when a uniformity value of the second value is ascribed to each surrounding reduced resolution picture element.

38. A method as claimed in claim 37 when dependent on claim 32, characterised in that detection is indicated by
storing a third value different from the first and second values in the store in place of the corresponding uniformity
value.

39. A method as claimed in any one of claims 30 to 38, characterised by repeating the resolution reduction and searching
at least once with the reduced resolution picture elements shifted with respect to the image picture elements.

40. A method as claimed in any one of claims 29 to 39, characterised in that the saturation is derived from red, green
and blue components as
(max(R,G,B) - min(R,G,B))/max(R,G,B), where max(R,G,B) and min(R,G,B) are the maximum and minimum values,
respectively, of the red, green and blue components.

41. A method as claimed in any one of the preceding claims, characterised in that a first image is captured while
illuminating an expected range of positions of a face, a second image is captured using ambient light, and the
second image is subtracted from the first image to form the image.

42. An apparatus for detecting a human face in an image, characterised by means for locating in the image a candidate
face region and means for analysing the candidate face region for a first characteristic indicative of a facial feature.

43. An observer tracking display characterised by an apparatus as claimed in claim 42.